
 Constanța


Romania to expel Russian consul after residential drone strike

Al Jazeera

Romanian President Nicusor Dan says that the Russian consul in the southeastern city of Constanta will be expelled and the consulate shut down after a drone intended for Ukraine crashed into an apartment complex in the border town of Galati.


Algorithmic warm starts for Hamiltonian Monte Carlo

arXiv.org Machine Learning

Generating samples from a continuous probability density is a central algorithmic problem across statistics, engineering, and the sciences. For high-dimensional settings, Hamiltonian Monte Carlo (HMC) is the default algorithm across mainstream software packages. However, despite the extensive line of work on HMC and its widespread empirical success, it remains unclear how many iterations of HMC are required as a function of the dimension $d$. On one hand, a variety of results show that Metropolized HMC converges in $O(d^{1/4})$ iterations from a warm start close to stationarity. On the other hand, Metropolized HMC is significantly slower without a warm start, e.g., requiring $\Omega(d^{1/2})$ iterations even for simple target distributions such as isotropic Gaussians. Finding a warm start is therefore the computational bottleneck for HMC. We resolve this issue for the well-studied setting of sampling from a probability distribution satisfying strong log-concavity (or isoperimetry) and third-order derivative bounds. We prove that \emph{non-Metropolized} HMC generates a warm start in $\tilde{O}(d^{1/4})$ iterations, after which we can exploit the warm start using Metropolized HMC. Our final complexity of $\tilde{O}(d^{1/4})$ is the fastest known for high-accuracy sampling under these assumptions, improving over the prior best of $\tilde{O}(d^{1/2})$. This closes the long line of work on the dimensional complexity of Metropolized HMC for such settings, and also provides a simple warm-start prescription for practical implementations.
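
The two-phase idea in this abstract (a non-Metropolized HMC phase to reach a warm start, followed by Metropolized HMC) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the step size, the number of leapfrog steps, and the iteration counts are all assumptions chosen for illustration, and the target is a standard Gaussian picked only because its gradient is trivial.

```python
import numpy as np

def leapfrog(x, p, grad_U, step, n_steps):
    """Leapfrog integrator for H(x, p) = U(x) + |p|^2 / 2."""
    p = p - 0.5 * step * grad_U(x)          # initial half step in momentum
    for i in range(n_steps):
        x = x + step * p                    # full step in position
        if i < n_steps - 1:
            p = p - step * grad_U(x)        # full step in momentum
    p = p - 0.5 * step * grad_U(x)          # final half step in momentum
    return x, p

def hmc_step(x, U, grad_U, step, n_steps, rng, metropolize):
    """One HMC iteration; the accept/reject filter is optional."""
    p0 = rng.standard_normal(x.shape)
    x_new, p_new = leapfrog(x, p0, grad_U, step, n_steps)
    if not metropolize:
        return x_new                        # non-Metropolized: always move
    h_old = U(x) + 0.5 * p0 @ p0
    h_new = U(x_new) + 0.5 * p_new @ p_new
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new                        # accept the proposal
    return x                                # reject and stay put

def sample(U, grad_U, x0, n_warm, n_main, step, n_steps, seed=0):
    """Warm-start phase without the filter, then Metropolized HMC."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_warm):
        x = hmc_step(x, U, grad_U, step, n_steps, rng, metropolize=False)
    samples = []
    for _ in range(n_main):
        x = hmc_step(x, U, grad_U, step, n_steps, rng, metropolize=True)
        samples.append(x.copy())
    return np.array(samples)

# Illustrative target (an assumption): standard Gaussian, U(x) = |x|^2 / 2.
d = 5
U = lambda x: 0.5 * x @ x
grad_U = lambda x: x
samples = sample(U, grad_U, x0=10.0 * np.ones(d),
                 n_warm=50, n_main=500, step=0.2, n_steps=10)
print(samples.shape, samples.mean(axis=0))
```

The starting point is deliberately far from the mode, so that the unfiltered phase has to do the work of reaching the bulk of the distribution before the Metropolized phase begins. The parameter scalings that yield the $\tilde{O}(d^{1/4})$ rate are specified in the paper and are not reproduced here.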



daff682411a64632e083b9d6665b1d30-Supplemental-Conference.pdf

Neural Information Processing Systems

Many high-dimensional statistical inference problems are believed to possess inherent computational hardness. Various frameworks have been proposed to give rigorous evidence for such hardness, including lower bounds against restricted models of computation (such as low-degree functions), as well as methods rooted in statistical physics that are based on free energy landscapes. This paper aims to make a rigorous connection between the seemingly different low-degree and free-energy based approaches. We define a free-energy based criterion for hardness and formally connect it to the well-established notion of low-degree hardness for a broad class of statistical problems, namely all Gaussian additive models and certain models with a sparse planted signal.


ky Xvk

Neural Information Processing Systems

We focus on six methods: (i) discriminative K-means (DisKmeans) in Ye et al. (2008); (ii) a discriminative clustering formulation described in Bach and Harchaoui (2008); Flammarion et al. (2017); We compare two classes F of feature mappings: linear functions and fully-connected neural networks with one hidden layer that has 100 nodes. An epoch refers to n/B = 12 consecutive iterations. The learning curves in Figure 1 show the advantage of the neural network and demonstrate the flexibility of CURE with nonlinear function classes. One of the main obstacles is the complicated piecewise definition of f, which prevents us from obtaining closed-form formulae.



1cf760a547822e2b8276881ad45f0fe9-Paper-Conference.pdf

Neural Information Processing Systems

Such a question is very relevant: boosting has quickly evolved as a technique requiring first-order information about the loss optimized [6, Section 10.3], [41, Section 7.2.2].




Near-Optimal Randomized Exploration for Tabular Markov Decision Processes

Neural Information Processing Systems

These algorithms inject (carefully tuned) random noise into the value function to encourage exploration. UCB-type algorithms enjoy well-established theoretical guarantees but are difficult to implement, since an upper confidence bound is usually infeasible to compute for many practical models like neural networks. Instead, practitioners prefer randomized exploration such as noisy networks in [19], and algorithms with randomized exploration have been widely used in practice [37,13,11,35].
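
As a generic illustration of exploring by injecting noise into value estimates, the Python sketch below perturbs tabular action values with Gaussian noise and then acts greedily. It is not the algorithm analyzed in the paper: the function name `perturbed_greedy_action`, the 1/sqrt(count) noise schedule, and the example numbers are all assumptions made for the sake of a small runnable example.

```python
import numpy as np

def perturbed_greedy_action(q_values, visit_counts, noise_scale, rng):
    """Act greedily with respect to noise-perturbed value estimates.

    The noise standard deviation shrinks as 1/sqrt(count), so rarely
    tried actions get larger perturbations and are explored more often.
    """
    std = noise_scale / np.sqrt(np.maximum(visit_counts, 1))
    noisy_q = q_values + std * rng.standard_normal(q_values.shape)
    return int(np.argmax(noisy_q))

# Hypothetical single-state example: action 1 has a slightly lower
# estimate than action 0 but has been tried only once.
rng = np.random.default_rng(0)
q = np.array([1.0, 0.9, 0.2])
counts = np.array([100, 1, 100])
picks = [perturbed_greedy_action(q, counts, 1.0, rng) for _ in range(1000)]
freq = np.bincount(picks, minlength=3) / 1000
print(freq)
```

With the noise scale set to zero the rule reduces to plain greedy selection. With positive noise, the under-sampled action 1 is chosen a substantial fraction of the time, while the well-sampled and clearly worse action 2 is almost never chosen.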